Towards a corpus-based dictionary of German noun-verb collocations
Abstract
We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should be about collocations and it should include a description of their linguistic properties, rather than listing the mere lexical cooccurrence. Since most statistical collocation finding tools do not provide other than lexical cooccurrence information, we first use symbolic extraction tools, based on a regular grammar over part-of-speech tagged and lemmatized text, and we use statistical filters thereafter. We first list the types of information which should be contained in a collocational dictionary for Natural Language Processing, then sketch our extraction methods and finally discuss and illustrate our initial results.
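The two-stage idea in the abstract (symbolic extraction over tagged, lemmatized text, then a statistical filter) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy sentences, the STTS-like tags (`NN`, `VV*`), the pattern "noun, up to three other tokens, full verb", the choice of pointwise mutual information as the filter, and the thresholds.

```python
import math
import re
from collections import Counter

# Toy corpus (hypothetical): each sentence is a list of (lemma, POS) pairs.
SENTENCES = [
    [("die", "ART"), ("Frage", "NN"), ("stellen", "VVINF")],
    [("eine", "ART"), ("Frage", "NN"), ("an", "APPR"), ("er", "PPER"), ("stellen", "VVINF")],
    [("eine", "ART"), ("Entscheidung", "NN"), ("treffen", "VVINF")],
    [("die", "ART"), ("Entscheidung", "NN"), ("schnell", "ADV"), ("treffen", "VVINF")],
    [("das", "ART"), ("Buch", "NN"), ("lesen", "VVINF")],
    [("die", "ART"), ("Frage", "NN"), ("lesen", "VVINF")],
]

# Assumed regular pattern over tag classes: a noun, at most three
# tokens that are neither noun nor full verb, then a full verb.
PATTERN = re.compile(r"Nx{0,3}V")


def tag_class(tag):
    """Map a POS tag to one character so a regex index equals a token index."""
    if tag == "NN":
        return "N"
    if tag.startswith("VV"):
        return "V"
    return "x"


def extract_pairs(sentences):
    """Symbolic step: collect (noun lemma, verb lemma) pairs matching PATTERN."""
    pairs = []
    for sentence in sentences:
        classes = "".join(tag_class(tag) for _, tag in sentence)
        for match in PATTERN.finditer(classes):
            noun = sentence[match.start()][0]
            verb = sentence[match.end() - 1][0]
            pairs.append((noun, verb))
    return pairs


def pmi_filter(pairs, min_freq=2, min_pmi=0.5):
    """Statistical step: keep pairs that are frequent and have high PMI."""
    pair_counts = Counter(pairs)
    noun_counts = Counter(noun for noun, _ in pairs)
    verb_counts = Counter(verb for _, verb in pairs)
    total = len(pairs)
    kept = {}
    for (noun, verb), freq in pair_counts.items():
        # PMI = log2( p(noun, verb) / (p(noun) * p(verb)) )
        pmi = math.log2(freq * total / (noun_counts[noun] * verb_counts[verb]))
        if freq >= min_freq and pmi >= min_pmi:
            kept[(noun, verb)] = (freq, pmi)
    return kept


candidates = pmi_filter(extract_pairs(SENTENCES))
for (noun, verb), (freq, pmi) in sorted(candidates.items()):
    print(f"{noun} + {verb}: freq={freq}, PMI={pmi:.2f}")
```

Because the symbolic step works on tags and lemmas rather than raw word pairs, each candidate keeps its position in the sentence, so properties such as intervening material could be recorded alongside the bare cooccurrence count.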
Similar resources
Verb-Noun Collocation SyntLex Dictionary: Corpus-Based Approach
The project presented here is part of a long-term research program aiming at a full lexicon grammar for Polish (SyntLex). The main concern of this project is computer-assisted acquisition and morpho-syntactic description of verb-noun collocations in Polish. We present methodology and resources obtained in three main project phases which are: dictionary-based acquisition of collocation lexicon...
Collocations of Complex Nouns: Evidence for Lexicalisation
This paper combines a corpus-based study of noun+verb collocations with an attempt to distinguish compositional, regularly formed compounds from lexicalised ones. We claim that morphologically regular, compositional compounds share most of their collocational preferences with their compound heads, whereas lexicalised compounds have their own collocational preferences, distinct or only marginall...
Towards Distributional Semantics-based Classification of Collocations for Collocation Dictionaries
Automatic acquisition of raw source material is of great aid for the compilation of dictionaries, and, in particular, of specialized dictionaries such as collocation dictionaries. The extraction of collocations from corpora has been actively worked on since the late eighties. The quality of the state-of-the-art extraction algorithms allows the lexicographers to obtain lists of collocations they...
Extraction of V-N-Collocations from Text Corpora: A Feasibility Study for German
The usefulness of a statistical approach suggested by Church and Hanks (1989) is evaluated for the extraction of verb-noun (V-N) collocations from German text corpora. Some motivations for the extraction of V-N collocations from corpora are given and a couple of differences concerning the German language are mentioned that have implications on the applicability of extraction methods developed f...
Japanese Learners’ dictionary of I-adjective-noun Collocations
This paper demonstrates a method for creating a Japanese learners' dictionary of i-adjective-noun collocations. After an introduction of the importance of collocations and the necessity of their inclusion in Japanese language learning, we present various corpora types and corpus query tools that are used to obtain a variety of collocational usage in different types of discourse. The Japanese languag...